The diagnostic potential of urine in paediatric patients undergoing initial treatment for tuberculous meningitis

Tuberculous meningitis (TBM), an extrapulmonary form of tuberculosis, is the most severe complication associated with tuberculosis, particularly in infants and children. The gold standard for the diagnosis of TBM requires cerebrospinal fluid (CSF) collected by lumbar puncture, an invasive sampling method, and currently available CSF assays are often insufficient for a definitive TBM diagnosis. Urine is metabolite-rich and relatively unexplored in terms of its potential to diagnose neuroinfectious diseases. We used an untargeted proton nuclear magnetic resonance (1H-NMR) metabolomics approach to compare the urine from 32 patients with TBM (stratified into stages 1, 2 and 3) against that from 39 controls in a South African paediatric cohort. Significant spectral bins had to satisfy three of our four strict quantitative statistical cut-off criteria. Five significant biological metabolites were identified (1-methylnicotinamide, 3-hydroxyisovaleric acid, 5-aminolevulinic acid, N-acetylglutamine and methanol), which had no correlation with medication metabolites. ROC analysis revealed that methanol lacked diagnostic sensitivity, but the other four metabolites showed good diagnostic potential. Furthermore, we compared mild (stage 1) TBM and severe (stages 2 and 3) TBM, and our multivariate metabolic model could successfully classify severe but not mild TBM. Our results show that urine can potentially be used to diagnose severe TBM.


Sample collection
This study compared the urine of a control group (n = 39) with that of an experimental group of paediatric patients (n = 32). The control samples, collected anonymously with written and informed consent, came from paediatric patients admitted to Tygerberg Hospital who were classified as 'normal'. Here, 'normal' is based on the following benchmark: paediatric patients without meningitis, without neurological symptoms, and from the same geographic location as the patients with TBM. No further clinical data on the controls were available due to ethical constraints. The group of patients with TBM was subdivided into stage 1 TBM (n = 8), stage 2 TBM (n = 10) and stage 3 TBM (n = 14). Clinical symptoms included headache, fever, nausea/vomiting, photophobia, meningeal irritation, and/or neck stiffness [9]. The first urine sample after hospital discharge was obtained and classified as time point 1 (T1). Table 1 provides a more detailed description of the demographics and clinical information of the TBM patients.

Sample transport, storage, and handling
The urine samples were stored at −80 °C in a dedicated freezer at Stellenbosch University's Division of Molecular Biology and Human Genetics. These samples were then collectively transferred overnight, frozen on dry ice, to the BSL-3 Laboratory at Human Metabolomics, North-West University, Potchefstroom campus, and stored at −80 °C. On the day that the metabolomics analyses were performed, the samples were allowed to thaw in a biological safety cabinet. A pooled quality control (QC) urine sample was created by aliquoting 50 µL of each sample and pooling them into a single tube.

Sample preparation and 1H-NMR analysis
Prior to analysis, all urine samples were thawed to room temperature. To remove any particulates and macromolecules, 600 µL of urine was centrifuged at 12,000 × g for 5 min. A 540 µL volume of supernatant was added to [...]
Table 1. Demographic and clinical characteristics of the TBM patients in this study stratified into stages 1, 2 and 3. CXR, chest radiography; GCS, Glasgow coma scale; Sex (M = male, F = female); CSF, cerebrospinal fluid; ICP, intracranial pressure; VP, ventriculoperitoneal. *Median with interquartile range.
www.nature.com/scientificreports/
[...] water peak at ~4.72 ppm were removed. The coefficient of variation (CV) was checked in all bins for the QC samples, and any bin with a QC CV > 10% was removed. Therefore, all remaining bins were considered reliable (< 10% analytical variation) for biological interpretation.

Statistical analysis
All the statistical analyses were performed with Microsoft Excel, MetaboAnalyst 5.0 (an online metabolomics tool; www.metaboanalyst.ca/), and GraphPad Prism 10. The data were log transformed and Pareto scaled for all multivariate analyses [principal component analysis (PCA), orthogonal partial least squares discriminant analysis (OPLSDA), and random forest (RF), which was run alongside OPLSDA to support the results by comparing the discriminatory variables], while all univariate statistics were analysed using nonparametric tests. Unsupervised PCA was used to obtain a visual overview of the data, that is, a qualitative assessment of group differentiation and/or clustering. Subsequently, four quantitative statistical tests were performed with defined and strict critical cut-off criteria. A spectral bin was considered important for further investigation of its diagnostic potential if it met at least three of the following four criteria: 1) OPLSDA with a VIP > 1.5 (supported by an RF out-of-bag (OOB) value < 0.05), 2) a Wilcoxon test with an FDR p value < 0.01, 3) a fold change > 5.0, and 4) a Hedges' g effect size > 1.0. The metabolites within the important bins were identified based on 1D and 2D 1H-NMR pure compound spectral libraries, and the metabolite concentrations were calculated as mmol/mol creatinine. Subsequently, violin plots were used to illustrate the distribution of the concentrations of these important metabolites, as well as their medians, interquartile ranges, and statistical significance (p < 0.01, with multiple test corrections).
Receiver operating characteristic (ROC) curves of the most important biological variables were generated to assess their diagnostic potential, and correlation analysis was performed to assess whether the biological diagnostic markers and unknowns had any strong (r > 0.6) and statistically significant (p < 0.05) correlations with the metabolites of the medication. Figure 1 illustrates the overall experimental design.
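The log transformation and Pareto scaling applied before the multivariate analyses can be illustrated with a minimal sketch. It assumes a plain log10 transform of strictly positive intensities and the sample standard deviation; the exact transform used by MetaboAnalyst may differ, and the function name and data below are hypothetical.

```python
import math
from statistics import mean, stdev


def log_pareto_scale(matrix):
    """Log10-transform a samples x bins matrix, then Pareto-scale each bin.

    Pareto scaling mean-centres each column and divides it by the square
    root of its standard deviation. Assumes all values are positive and
    that no column is constant.
    """
    logged = [[math.log10(v) for v in row] for row in matrix]
    scaled_columns = []
    for col in zip(*logged):
        centre = mean(col)
        spread = math.sqrt(stdev(col))  # square root of the SD, not the SD
        scaled_columns.append([(v - centre) / spread for v in col])
    # Transpose back to samples x bins
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical intensities: 4 samples x 2 spectral bins
data = [[10.0, 200.0], [20.0, 180.0], [40.0, 260.0], [80.0, 240.0]]
scaled = log_pareto_scale(data)
```

Compared with unit-variance scaling, dividing by the square root of the standard deviation reduces the dominance of large peaks while retaining more of the original data structure.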

Quality assurance
A total of 21 QC samples were run at regular intervals throughout the 1H-NMR analysis. Quantitatively, the coefficient of variation (CV) was calculated in Excel for all spectral bins, and the bins with a QC CV greater than 10% (i.e., more than 10% analytical variation, likely due to horizontal shifting in the 1H-NMR spectra in regions that contain peaks that are sensitive to slight pH differences) were removed from the entire data matrix. Hence, the quality of the remaining data is assured, and the variance assessed is biological, with negligible analytical variation. Furthermore, qualitatively, an evaluation of the PCA scores plot (Figure S1) showed that the QCs clustered closely together; therefore, no experimental drift of the analytical pipeline was observed in this study.
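The QC filter described above can be sketched as follows. This is an illustration only: it assumes the CV is computed from the sample standard deviation (as Excel's STDEV does), and the bin labels and intensities are hypothetical.

```python
from statistics import mean, stdev


def qc_cv_percent(values):
    """Coefficient of variation (%) of one bin across the QC injections."""
    return 100.0 * stdev(values) / mean(values)


def reliable_bins(qc_matrix, bin_labels, max_cv=10.0):
    """Return labels of bins whose QC CV does not exceed max_cv (%).

    qc_matrix is QC samples x bins; bins above the threshold would be
    removed from the entire data matrix.
    """
    keep = []
    for label, column in zip(bin_labels, zip(*qc_matrix)):
        if qc_cv_percent(column) <= max_cv:
            keep.append(label)
    return keep


# Hypothetical QC intensities: 3 QC injections x 3 bins
qc = [
    [100.0, 50.0, 10.0],
    [102.0, 65.0, 10.5],
    [98.0, 40.0, 9.8],
]
labels = ["bin_1.33", "bin_2.05", "bin_3.04"]
kept = reliable_bins(qc, labels)  # the middle bin has a CV of about 24%
```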

Qualitative overview: multivariate statistics
The unsupervised PCA scores plots show natural differentiation between controls and all stages of TBM (Fig. 2A), as well as between individual comparisons (Figs. 2B-D). These qualitative results provide confidence in proceeding with additional quantitative statistical analyses, namely OPLSDA with RF and univariate assessment.

Quantitative statistical evaluation
Four quantitative statistical tests, with strict critical cut-off criteria, were evaluated in our binned 1H-NMR data set: 1) OPLSDA with a VIP > 1.5 (supported by RF with similar discriminatory bins and OOB values < 0.05; Fig. 3), 2) Wilcoxon test with FDR p value < 0.01, 3) fold change > 5.0, and 4) Hedges' g effect size > 1.0. Comparisons were made between the controls and each stage of TBM individually. If a bin achieved at least three of the four critical cut-off criteria, then that bin was considered important for the investigation of diagnostic markers.
Table 2 shows a summary of these statistically significant quantitative results. As expected, several anti-TB drugs and their metabolites were identified as significant in differentiating TBM patients on treatment from controls. Of interest to our study were the biological metabolites identified as significant: N-acetylglutamine, succinic acid, citric acid, 5-aminolevulinic acid, 1-methylnicotinamide, and quinolinic acid. Furthermore, 14 unknown compounds that were not found in our spectral library databases were identified as important.
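The three-of-four rule lends itself to a short sketch. The code below computes Hedges' g with the usual small-sample correction and a fold change from group means, then counts how many cut-offs a bin passes. Treating the fold change as direction-independent and taking the absolute value of g are assumptions, as are the function names, the concentrations, and the VIP and FDR values, which would come from the OPLSDA model and the corrected Wilcoxon test, respectively.

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Hedges' g: Cohen's d with the small-sample bias correction."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    correction = 1.0 - 3.0 / (4.0 * (n_a + n_b) - 9.0)
    return correction * d


def fold_change(group_a, group_b):
    """Ratio of group means, reported as >= 1 regardless of direction."""
    ratio = mean(group_a) / mean(group_b)
    return max(ratio, 1.0 / ratio)


def is_important(vip, fdr_p, fc, g):
    """True if a bin passes at least three of the four cut-off criteria."""
    passed = [vip > 1.5, fdr_p < 0.01, fc > 5.0, abs(g) > 1.0]
    return sum(passed) >= 3


# Hypothetical concentrations for one bin
tbm = [12.0, 15.0, 11.0, 14.0]
controls = [2.0, 3.0, 2.5, 2.5]
fc = fold_change(tbm, controls)   # 13.0 / 2.5 = 5.2
g = hedges_g(tbm, controls)       # large positive effect size
# VIP and FDR p value are placeholders for model outputs
important = is_important(vip=1.2, fdr_p=0.004, fc=fc, g=g)
```

In this example the bin fails the VIP cut-off but passes the other three criteria, so it would still be carried forward.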

Correlations
We also performed a correlation analysis of all the significant variables identified in Table 2 for all patients with TBM (Fig. 4). Of the five biological diagnostic metabolic markers, only 3-hydroxyisovaleric acid showed any strong correlations (r > 0.6), with statistical significance (p value < 0.05), with other significant variables, namely, propylene glycol and unknowns 4, 6 and 9. Hence, only 3-hydroxyisovaleric acid showed some correlation with a known medication compound (propylene glycol). The other four diagnostic markers, 1-methylnicotinamide, 5-aminolevulinic acid, N-acetylglutamine and methanol, did not show strong and/or statistically significant correlations with other significant variables. These data support our argument that 1-methylnicotinamide, 5-aminolevulinic acid, and N-acetylglutamine are diagnostic markers of TBM and are not directly associated with medication.
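The correlation screen can be sketched as below. The text does not state which correlation coefficient or significance test was used, so this example assumes Pearson's r and substitutes a two-sided permutation test for the p value; all names and values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p value for the observed correlation."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def strongly_correlated(x, y, r_min=0.6, alpha=0.05):
    """Apply the cut-offs from the text: r > 0.6 and p < 0.05."""
    return pearson_r(x, y) > r_min and permutation_p(x, y) < alpha


# Hypothetical concentrations across eight patients
marker = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1]
drug_metabolite = [0.9, 2.0, 3.2, 3.9, 5.3, 6.1, 6.8, 8.0]
unrelated = [5.0, 1.0, 8.0, 2.0, 7.0, 3.0, 6.0, 4.0]

flag_drug = strongly_correlated(marker, drug_metabolite)
flag_unrelated = strongly_correlated(marker, unrelated)
```

A marker flagged against a medication metabolite in this way would be treated with caution, as was done for 3-hydroxyisovaleric acid and propylene glycol.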

Evaluation of diagnostic markers for TBM
Since most of the significant variables annotated in Table 2 were medications and their known metabolites, we reran the statistical pipeline on a reduced data set in which all the bins related to known medications were removed. Furthermore, Fig. 2 and Table 2 show that TBM stages 2 and 3 exhibit similar characteristics. Therefore, to improve the power of the analysis, TBM stages 2 and 3 were combined and labelled 'severe TBM', and the remaining cases of stage 1 TBM were labelled 'mild TBM'. Figure 5 shows the PCA score plots for this revised data set: controls vs mild TBM show some overlap, and controls vs severe TBM show complete natural differentiation. Random forest showed that mild TBM could not be successfully classified using the current multivariate metabolic model (71.4% misclassified and OOB = 0.109). However, the random forest misclassified only one severe TBM case (4% misclassification) and had an OOB value < 0.05. Therefore, our multivariate metabolic model can successfully classify severe but not mild TBM. Furthermore, compared with the OPLSDA results, the random forest results showed similar discriminating bins.
Based on univariate statistics, using the Wilcoxon test with an FDR p value < 0.01 and a Hedges' g effect size > 2.0, several bins were identified as statistically and practically significant in differentiating controls from mild and severe TBM patients. Table 3 shows these quantitative statistical values, along with the annotations of the metabolites.
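The FDR-adjusted p values referred to above can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure. That this particular correction was used is an assumption, since the text states only that p values were adjusted for multiple tests; the raw p values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw Wilcoxon p values for five bins
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
adj = benjamini_hochberg(raw)
significant = [p < 0.01 for p in adj]  # cut-off used in the text
```

With these numbers only the first bin remains below 0.01 after adjustment, even though two raw p values were below that threshold.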
Five biological metabolites from Table 3 (1-methylnicotinamide, 3-hydroxyisovaleric acid, 5-aminolevulinic acid, N-acetylglutamine and methanol) were quantified (mmol/mol creatinine); the violin plots are presented in Fig. 6. ROC curve analysis of the diagnostic potential of these five metabolites was carried out based on the area under the curve (AUC), sensitivity, and specificity (Fig. 7 and Table 4).
As shown in Fig. 7, all five metabolites have diagnostic potential. 1-Methylnicotinamide had AUCs of 0.99 and 0.95 for mild TBM and severe TBM, respectively; 3-hydroxyisovaleric acid had AUCs of 0.89 and 0.99 for mild TBM and severe TBM, respectively; 5-aminolevulinic acid had AUCs of 0.94 and 0.99 for mild TBM and severe TBM, respectively; N-acetylglutamine had an AUC of 0.99 for both mild TBM and severe TBM; and methanol had AUCs of 0.92 and 0.98 for mild TBM and severe TBM, respectively. Table 4 shows the sensitivity at the specific cut-off concentrations, all with a specificity of 97.5% and a 95% confidence interval of 87%-100%.
1-Methylnicotinamide and N-acetylglutamine have 100% sensitivity as early diagnostic markers of mild TBM and > 90% sensitivity for severe TBM, while 3-hydroxyisovaleric acid has greater sensitivity for severe TBM, and 5-aminolevulinic acid has equivalent sensitivity as a diagnostic marker for both mild and severe TBM. Therefore, 1-methylnicotinamide, 3-hydroxyisovaleric acid, 5-aminolevulinic acid and N-acetylglutamine show potential as diagnostic markers of TBM but do not have sufficient power to differentiate mild-stage from severe-stage TBM. Methanol showed statistical significance (Fig. 6) but lacked diagnostic sensitivity for use as a diagnostic marker for TBM.
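The ROC quantities reported above can be sketched as follows. The AUC is computed here as the probability that a randomly chosen case has a higher concentration than a randomly chosen control (ties count as one half), and sensitivity and specificity are evaluated at a single cut-off. The sketch assumes the metabolite is elevated in cases; the concentrations and cut-off are hypothetical and are not values from Table 4.

```python
def auc(cases, controls):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(cases, controls, cutoff):
    """Classify a sample as positive when its concentration is >= cutoff."""
    true_pos = sum(1 for c in cases if c >= cutoff)
    true_neg = sum(1 for k in controls if k < cutoff)
    return true_pos / len(cases), true_neg / len(controls)


# Hypothetical concentrations (mmol/mol creatinine)
tbm_cases = [8.0, 9.5, 12.0, 7.0, 3.0]
control_group = [1.0, 2.0, 2.5, 4.0, 3.5]

area = auc(tbm_cases, control_group)
sens, spec = sensitivity_specificity(tbm_cases, control_group, cutoff=5.0)
```

A metabolite can have a high AUC and still show poor sensitivity once the cut-off is fixed to achieve a high specificity, which is the pattern described for methanol.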

Biological context for the four TBM diagnostic metabolites
Three of the four urinary metabolic markers of TBM described above (1-methylnicotinamide, 3-hydroxyisovaleric acid, and N-acetylglutamine) have been discussed in terms of their biological context in our previous study (see [23] for more details). Here, we report 5-aminolevulinic acid as being a significant urinary metabolite in TBM for the first time. 5-Aminolevulinic acid has been defined as a non-proteinogenic amino acid, meaning it is not used in the synthesis of proteins. This endogenous metabolite is biosynthesized in the body by the condensation of glycine and succinyl-CoA, catalysed by the enzyme 5-aminolevulinate synthase. 5-Aminolevulinic acid is required in the metabolism of heme in the body and is involved in maintaining normal mitochondrial function.
To the best of our knowledge, this is the first time that 5-aminolevulinic acid has been identified in any type of sample matrix collected from TBM patients. However, it is important to state that 5-aminolevulinic acid is not a specific marker of TBM. 5-Aminolevulinic acid has been identified in the urine of patients following an acute attack of porphyria [32], as well as in patients diagnosed with tyrosinemia type I [33]. Moreover, due to its anti-inflammatory and immunoregulatory properties [34], 5-aminolevulinic acid has been identified as a novel therapeutic for inflammatory bowel disease [35] and has potential in treating type II diabetes mellitus [36] and COVID-19 [37]. Hence, 5-aminolevulinic acid is likely an endogenous metabolic marker of the immune response to severe inflammation (severe and chronic neuroinflammation is a classic symptom of TBM). However, it is also important to note that some studies [38] have identified 5-aminolevulinic acid as being neurotoxic. Therefore, future studies should examine the levels of this metabolite in the brain of TBM patients.

TBM stage differentiation
In addition to identifying potential diagnostic markers in each TBM stage (comparing TBM patients with controls), the TBM stages were also compared to each other to determine whether any urinary metabolic marker could differentiate the TBM stages. Based on the PCA score plot (Figure S2), no qualitative differentiation could be detected between the stages of TBM (i.e., the metabolic urinary 1H-NMR profiles could not differentiate the stage of TBM). Quantitative statistical data were checked and did not show statistically significant differences. Therefore, no urinary metabolites could differentiate the stages of TBM.

Unknown compounds-Biological or medication related?
Ten significant unknown compounds were identified, as shown in Table 3, and quantified. Based on the correlation data (Fig. 4), none of the unknowns were correlated with the identified anti-TB medications. However, unknowns 6, 9 and 10 were significantly correlated with propylene glycol and may be related to other forms of medication. The other unknowns (1, 2, 3, 4, 5, 7 and 8) had no correlation with medication and may be of biological origin.

Speculation on the identities of the unknowns
Based on the 2D JRES and COSY NMR spectral data, a doublet at 1.55 ppm (Unknown 2) was correlated with a quartet at 5.19 ppm (Unknown 8) (see Figure S3 in the supplementary information for more details). This 1H-NMR chemical shift information implies that a terminal methyl group is present next to a double bond CH component. This chemical information suggests that one of the unknowns is likely to have a chemical structure similar to that of a short-chain unsaturated carboxylic acid. The 1H-NMR spectra of several suspected short-chain unsaturated carboxylic acids were investigated, including propionic acid (C3), butyric acid (C4), and isobutyric acid (C4). However, while the patterns were similar, none of these pure compound spectra matched the unknown spectra sufficiently. We suspect that the unknown compound that contains a doublet at 1.55 ppm and a quartet at 5.19 ppm is a short branched-chain organic acid that is methylated. Short branched-chain organic acids originate from micro-organisms, and these unknowns unique to the TBM cases could be degradation products of the unique cell wall of M.tb. We recommend that additional cohorts of urine samples collected from treated TBM patients undergo targeted gas chromatography-mass spectrometry (GC-MS) metabolomics analyses, with a focus on short and branched-chain organic acids that are methylated.

Conclusions
The identification of urinary biomarkers for TBM is a promising area of research. Our multivariate metabolic model can successfully classify severe but not mild TBM. The four metabolites (1-methylnicotinamide, 3-hydroxyisovaleric acid, 5-aminolevulinic acid and N-acetylglutamine) identified in this study show good diagnostic potential for severe TBM (stage 2 and 3 TBM patients combined), but they have not yet been established as definitive diagnostic tools. Future studies are needed to validate these biomarkers, including the significant unknowns identified in this study, to determine their clinical utility and explore their specificity and sensitivity for the diagnosis of TBM. Further research is needed, but we believe that urine from TBM patients will potentially aid in the earlier diagnosis and treatment of this disease.

Table 2. Quantitative statistically significant results (bins that achieved 3 of the 4 demarcated critical cut-off criteria for controls compared with TBM stages 1, 2 and 3). The annotations given to the bins were based upon 1D and 2D 1H-NMR spectral library matches. VIP, variables important in projection (OPLSDA); FDR, Wilcoxon p value adjusted for multiple tests; FC, fold change; ES, Hedges' g effect size; X indicates an unknown compound.

Fig. 2. PCA score plots of (A) Controls vs. all three TBM stages, (B) Controls vs. Stage 1 TBM, (C) Controls vs. Stage 2 TBM, (D) Controls vs. Stage 3 TBM. These PCA scores plots give qualitative confirmation that natural differentiation exists between the control group and each TBM stage, supporting the use of quantitative multivariate statistical analysis.

Fig. 3. Random forest outputs supporting the OPLSDA results. The results are shown in the left, centre, and right panels for Stage 1 TBM, Stage 2 TBM, and Stage 3 TBM, respectively. The top part of each panel shows that similar discriminating bins were found in comparison to the OPLSDA results, and the out-of-bag (OOB) values for each panel are < 0.05, supporting the OPLSDA model. The bottom part of each panel shows the random forest classification: stage 1 TBM cases were misclassified 25% of the time as controls, and no misclassifications occurred for stage 2 TBM, while stage 3 TBM cases were misclassified 14.3% of the time as controls.

Fig. 4. Plot showing correlations with an r value greater than 0.6 and a p value < 0.05 for all important quantified variables in Table 2 for all the TBM cases.

Fig. 5. PCA score plots of revised data sets of (LEFT) controls vs mild TBM, showing some overlap, and (RIGHT) controls vs severe TBM, showing complete natural differentiation between the groups.

Table 3. Statistically and practically significant bins of the reduced data set based on the Wilcoxon test with an FDR p value < 0.01 and a Hedges' g effect size > 2.0. Annotations given to the bins were based upon 1D and 2D 1H-NMR spectral library matches. The Wilcoxon p values, adjusted for multiple tests, were < 0.01 for all comparisons. ES, Hedges' g effect size; X indicates an unknown compound.